Document Simplicial Complex
A k-simplex is defined as a k-dimensional geometric structure that is the convex hull
of k+1 points. Given k+1 affinely independent points $x_0, \dots, x_k \in \mathbb{R}^k$, the
set
$$C = \left\{ a_0 x_0 + \cdots + a_k x_k \;\middle|\; \sum_{i=0}^{k} a_i = 1 \text{ and } a_i \ge 0 \text{ for all } i \right\},$$
is defined as the k-simplex determined by them. The simplex is a basic building
block in abstract topology. A collection of simplices satisfying certain
conditions is called a geometric simplicial complex, which helps in analyzing a
geometric structure at a larger scale. An abstract simplicial complex is a purely
combinatorial description of the geometric notion of a simplicial complex, consisting
of a family of non-empty finite sets closed under the operation of taking non-empty
subsets.
A text document can be visualized as a geometric structure in topology. A document
is defined as a collection of words, where each word is considered part of a
vocabulary and has a certain meaning. An n-gram is a contiguous sequence of n
items from a given sample of text. Using the n-gram concept to define a simplex, we
can construct an abstract simplicial complex from every text document. In this
model, each simplex captures local structure or behavior, while the document
simplicial complex, which is the collection of all (n-1)-simplices, captures the global
behavior of the document. We study this under the assumption that we have a bag of
documents, i.e. the universal set of documents.
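The construction described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names `ngrams` and `document_complex`, the whitespace tokenization, and the choice to treat each n-gram as the set of its distinct words are all assumptions made for the example.

```python
from itertools import combinations


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_complex(text, n=3):
    """Build an abstract simplicial complex from the n-grams of a text.

    Assumption: each n-gram contributes the set of its distinct words as a
    simplex, and every non-empty subset of that set is added as a face, so
    the result is closed under taking non-empty subsets.
    """
    tokens = text.lower().split()  # naive tokenization, for illustration only
    faces = set()
    for gram in ngrams(tokens, n):
        vertices = sorted(set(gram))
        for size in range(1, len(vertices) + 1):
            for face in combinations(vertices, size):
                faces.add(frozenset(face))
    return faces


doc = "after clearing high school one joins college"
K = document_complex(doc, n=3)
```

With n = 3, each trigram of distinct words becomes a 2-simplex (a triangle), and words that never share a trigram, such as "after" and "college" here, are not joined by an edge.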
The aim of this thesis is to understand the abstract structure admitted by text
documents in order to find similar documents more accurately within a given family of
text documents. In our discussion, we visualize a document as a geometric entity
and use this representation to speed up the process of
querying, where, given a query document, one can find the semantically similar
documents more efficiently in terms of both time and similarity. For example, given the
set of documents {1. "After clearing high school one joins college", 2. "College can
be joined only after passing high school", 3. "High school and college must be
attended by everyone"}, documents 1 and 2 are more semantically similar than 1
and 3 or 2 and 3.
After a brief look at abstract topology, we study the topological structure and
behavior of text documents. A novel representation of documents is given in this
thesis. Using this new structure, we represent each document as a
geometric entity that can be further analyzed using topological tools. Using the Earth
Mover's distance and the Hausdorff distance, we give a new formulation for retrieving
semantically similar documents for a given query. To represent documents as a mathematical structure
in some $\mathbb{R}^k$, we use the Word2Vec model to find a vector representation of each word in a
text document.
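To make the distance-based comparison concrete, the sketch below computes the Hausdorff distance between two documents viewed as finite sets of word vectors. It is only an illustration under stated assumptions: the 2-D points are hypothetical stand-ins for Word2Vec embeddings, the function names are invented for the example, and the thesis's actual formulation (which also involves the Earth Mover's distance) may differ.

```python
import math


def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Hypothetical 2-D "word vectors"; real embeddings would come from a trained model.
doc_a = [(0.0, 0.0), (1.0, 0.0)]
doc_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
doc_c = [(5.0, 5.0), (6.0, 5.0)]

d_ab = hausdorff(doc_a, doc_b)
d_ac = hausdorff(doc_a, doc_c)
```

Here `doc_b` shares most of its points with `doc_a`, so `d_ab` is smaller than `d_ac`; under this measure `doc_b` would rank as the closer match for the query `doc_a`.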
Comparison of satellite image-based vegetation indices for extraction and mapping of litchi (Litchi Chinensis) cultivation area in Muzaffarpur district, Bihar, India
The aim of the present study was to evaluate the suitability of various vegetation indices (VIs) for extracting the litchi cultivation area in Muzaffarpur district of Bihar, India. VIs computed from the multispectral bands of Landsat satellites were used to delineate litchi cultivation areas from other land cover categories. Ten selected VIs were applied and their effectiveness in mapping the litchi cultivation area was compared for the years 2016 and 2020. The results showed that the Normalized Green Blue Difference Index (NGBDI) was the most appropriate for extracting and mapping the litchi cultivation area. The area statistics of litchi cultivation were validated and correspond closely with the data reported by the state horticulture department. The litchi cultivation area increased from 10272.79 ha to 10400.63 ha over the four-year period (2016-2020) in the area under investigation. The spatial distribution maps of litchi cultivation provide a vital reference for developing a regional action plan to promote its cultivation and its benefits to farmers.
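For reference, NGBDI is commonly defined per pixel as (G - B) / (G + B), where G and B are the green and blue band values. The sketch below applies that formula to a small grid. The band values are hypothetical, the helper names are invented for the example, and returning 0 for a zero denominator is an assumption; the study's own processing chain and any classification threshold are not given in the abstract.

```python
def ngbdi(green, blue):
    """Normalized Green Blue Difference Index for one pixel: (G - B) / (G + B)."""
    total = green + blue
    if total == 0:
        return 0.0  # assumption: treat a zero denominator as index 0
    return (green - blue) / total


def ngbdi_image(green_band, blue_band):
    """Apply NGBDI pixel by pixel to two equally sized 2-D grids."""
    return [
        [ngbdi(g, b) for g, b in zip(g_row, b_row)]
        for g_row, b_row in zip(green_band, blue_band)
    ]


# Hypothetical reflectance values for a 2 x 2 image.
green = [[0.30, 0.10], [0.20, 0.0]]
blue = [[0.10, 0.10], [0.05, 0.0]]
index = ngbdi_image(green, blue)
```

Values close to 1 indicate pixels where green dominates blue, which is why indices of this form are used to separate vegetation from other land cover.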